Abstract

Background: Recent studies have investigated the contribution of copy number variants (CNVs) to disease 
susceptibility in a multitude of complex disorders, including systemic lupus erythematosus, Crohn's disease, and 
various neurodevelopmental disorders. Relatively few CNV studies, however, have been conducted on 
pharmacologic phenotypes even though these structural variants are likely to play an important role. We 
developed a genome-wide method to identify CNVs that contribute to heterogeneity in drug response, focusing 
on drugs that are widely used in anticancer treatment regimens. 

Results: We conducted a comprehensive genome-wide study of CNVs from population-scale array-based and 
sequencing-based surveys by analyzing their effect on cellular sensitivity to platinating agents and topoisomerase II 
inhibitors. We identified extensive CNV regions associated with cellular sensitivity to functionally diverse 
chemotherapeutics, supporting the hypothesis that variation in copy number contributes to variation in drug 
response. Interestingly, although single nucleotide polymorphisms (SNPs) tag some of the CNVs associated with 
drug sensitivity, several of the most significant CNV-drug associations are independent of SNPs; consequently, they 
represent genetic variations that have not been previously interrogated by SNP studies of pharmacologic 
phenotypes. 

Conclusions: Our findings demonstrate that pharmacogenomic studies may greatly benefit from the study of 
CNVs as expression quantitative trait loci, thus contributing broadly to our understanding of the complex trait
genetics of CNVs. We also extend our PACdb resource, a database that makes available to the scientific community
relationships between genetic variation, gene expression, and sensitivity to various drugs in cell-based models.

Background

Copy number variants (CNVs) have received consider- 
able attention in recent years as studies have implicated 
them in a wide range of complex human phenotypes, 
including susceptibility to HIV-1/AIDS [1], Crohn's dis- 
ease [2], and various autoimmune disorders. The sys- 
tematic assessment of their role in the etiology of 
complex disease has been predicated on improvements 
in genotyping technologies (including SNP-based geno- 
typing arrays and clone-based comparative genomic 
hybridization) and on advances in algorithms for copy 
number analysis [3]. Genome-wide surveys of CNVs
[4,5] have sought to produce a comprehensive map to
enable disease association studies, but a recent compre- 
hensive study reports a somewhat disappointing finding 
that CNVs are likely to make a relatively minor contri- 
bution to the genetic basis of complex traits [6], particu- 
larly disease susceptibility. 

While the study of the contribution of CNVs to drug 
response has lagged behind the investigation of their 
contribution to disease risk, there have been some nota- 
ble findings coming out of candidate gene approaches. 
The gene CYP2D6 encodes an enzyme that metabolizes
a large number of drugs, including antidepressants,
neuroleptics, analgesics, and anticancer drugs.
It has been demonstrated that CYP2D6 copy number
ranges from 0 to 13 copies [7]. Studies have
shown that copy number for this gene affects the plasma 
levels of the active metabolite of tamoxifen, namely 
endoxifen, so that ultra-rapid metabolizers who carry
more than two copies of the gene show much higher
levels of endoxifen than those who carry the regular 
copy number for the gene [8]. Higher CYP2D6 activity 
due to gene amplification has also been shown to pre- 
dispose to life-threatening opioid intoxication [9]. 
Another drug metabolizing cytochrome P450 gene, 
CYP2A6, also occurs in variable copy number. CYP2A6 
encodes an enzyme that metabolizes several drugs, 
including nicotine and its metabolite cotinine. Increased 
CYP2A6 activity has been shown to be responsible for 
increased risk for nicotine addiction [10] and for 
tobacco-related cancers. The SULT family of Phase II 
conjugating enzymes, particularly that encoded by 
SULT1A1, has been the subject of extensive pharmaco- 
genetic studies that show the importance of CNVs as a 
genetic source of variability in the metabolic activity of 
these enzymes. SULT pharmacogenomic studies [11] 
have highlighted CNV-based mechanisms that lead to 
increased risk for chemical carcinogenesis and adverse 
drug reactions. Glutathione S-transferase (GST), also a 
phase II family of conjugation enzymes, plays an impor- 
tant role in the detoxification of drugs. Studies have 
shown that homozygous deletion of GSTM1 is corre- 
lated with increased cancer risk and with better treat- 
ment outcome [12,13]. These findings and related 
developments highlight the necessity of incorporating 
copy number analysis in elucidating the genetic under- 
pinnings of drug response. 

The recently released catalog [4] from an extensive 
survey of copy number regions assayed in cell lines 
from the International HapMap project and the subse- 
quent study of genomic structural variants based on 
whole genome DNA sequencing data (the 1000 Gen- 
omes Project) [14] allow for new pharmacogenomic dis- 
coveries and for deep insights into the genetic basis of 
pharmacologic phenotypes, which to date has largely 
been based on studies of SNPs [15]. In whole-genome 
studies using lymphoblastoid cell lines (LCLs), cellular 
sensitivity to drug [16] as well as gene expression phe- 
notypes [17] have been shown to be heritable [18] and 
to include a significant genetic component. Although 
many CNV pharmacogenetic studies have focused on 
pharmacokinetic genes, we chose to evaluate pharmaco- 
dynamic genes using an LCL-based model. Studies in 
our laboratory have generated a rich resource of phar- 
macologic data [19] on a wide array of chemotherapeu- 
tic agents using the HapMap cell lines, enabling us to 
conduct a systematic analysis of the role of CNVs for a 
variety of anticancer drugs. 

Results 

Genome-wide association studies 

LCLs from unrelated CEU samples were phenotyped for
cellular sensitivity to the four chemotherapeutic drugs
included in our study: carboplatin [20], cisplatin [21],
daunorubicin [22], and etoposide [16]. We conducted
genome-wide association scans using drug IC50 as a
quantitative trait.

A total of 5,238 CNVs from an array-based study [4]
were evaluated in genome-wide association studies
(GWAS) against the cellular drug sensitivity phenotypes. Of
these CNVs, 77% are deletions (0, 1, or 2 copy number),
16% are amplifications (2, 3, or 4 copy number), and the
remainder are multi-allelic (more than three diploid copy
number genotypes) [4]. At the nominal significance
threshold of P < 0.05, we identified 67 CNVs associated
with carboplatin IC50, 70 CNVs with cisplatin IC50, 73
CNVs with daunorubicin IC50, and 113 CNVs with
etoposide IC50.
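The association scan described above can be illustrated for a single CNV. The following is a minimal sketch, assuming a simple linear model of log2-transformed IC50 on copy number with a permutation P-value; the text does not specify the exact test statistic, so the model, the function names, and the toy data are all illustrative assumptions.

```python
import math
import random

def slope_and_r(x, y):
    """Least-squares slope and Pearson r for y regressed on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx, sxy / math.sqrt(sxx * syy)

def permutation_p(copy_number, ic50, n_perm=2000, seed=0):
    """Two-sided permutation P-value for log2(IC50) versus copy number."""
    rng = random.Random(seed)
    y = [math.log2(v) for v in ic50]
    _, r_obs = slope_and_r(copy_number, y)
    shuffled = y[:]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any genotype-phenotype link
        _, r = slope_and_r(copy_number, shuffled)
        if abs(r) >= abs(r_obs) - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)

# Hypothetical toy data: diploid copy number at one deletion CNV and
# the IC50 (arbitrary concentration units) of ten cell lines.
copy_number = [0, 0, 1, 1, 1, 2, 2, 2, 2, 2]
ic50 = [2.0, 2.4, 3.9, 4.4, 3.6, 8.1, 7.5, 9.0, 8.4, 7.9]

slope, r = slope_and_r(copy_number, [math.log2(v) for v in ic50])
p_value = permutation_p(copy_number, ic50)
```

In a genome-wide scan, the same calculation would be repeated for each of the 5,238 CNVs and each drug, and the CNVs with P < 0.05 retained.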

Genomic characterization of drug susceptibility- 
associated CNVs 

We further evaluated the genomic characteristics of 
these drug susceptibility-associated CNVs for their size 
and type (deletion versus amplification). In general, 
there is little (Pearson) correlation between the size of a 
CNV and its association with cellular sensitivity to car- 
boplatin (r = 0.020), cisplatin (r = 0.008), daunorubicin 
(r = 0.054) and etoposide (r = 0.024). We did, however,
observe that the top CNVs associated (P < 0.05) with
IC50 for daunorubicin are significantly smaller (average
size of 10.6 kb) than expected (average size of 14 kb)
from the full set of CNVs included in our study;
etoposide-associated CNVs are, in contrast, close to
expectation (average size of 13.4 kb). The CNVs associated with
carboplatin and cisplatin IC50 (average size of 11.2 kb
and 11.4 kb, respectively) are also significantly smaller than
expected.

Sixty-two of the 67 carboplatin-associated CNVs (P <
0.05) are biallelic (that is, deletions or amplifications);
the remaining five are multi-allelic (that is, they have
more than three copy number genotypes).
Deletions are significantly more frequent (85%) than 
duplications among the carboplatin-associated CNVs. 
Similarly, all but 4 of the 70 cisplatin-associated CNVs 
(P < 0.05) are biallelic. The top cisplatin-associated 
CNVs are significantly more likely to be deletions (87%) 
than duplications. 

All but 9 of the 113 etoposide-associated CNVs (P < 0.05) are
biallelic; 69 of the 73 daunorubicin-associated CNVs
(P < 0.05) are biallelic. Nearly 82% of the
daunorubicin-associated CNVs and 82% of the etoposide-associated
CNVs are deletions.

Drug susceptibility-associated CNVs are enriched for 
expression quantitative trait loci 

We observed that no exons overlap the genomic regions
defined by the top associated CNVs for each anticancer
drug included in our study, suggesting that these CNVs
do not act by disrupting coding sequence. We thus
hypothesized that these CNVs act to influence gene
regulation. We evaluated the functional import of the drug
susceptibility-associated CNVs by considering their
effect on global gene expression. At an expression
association threshold of P < 0.0001, 60% (N = 40) of the top
CNVs associated with carboplatin (P < 0.05) were found
to be expression quantitative trait loci (eQTLs).
Interestingly, two of the top carboplatin-associated CNVs
(CNVR3882.1 on chromosome 8 and CNVR666.1 on
chromosome 2) predict the expression of SELL. We
found that SELL expression level is also significantly
correlated with carboplatin IC50 (P = 0.02) in the CEU
samples. We identified several target genes of
carboplatin-associated CNVs (as eQTLs) whose expression levels
were significantly correlated (after multiple testing
correction [23], false discovery rate (FDR) < 0.05) with
carboplatin IC50, including PHGDH, MYO1B, TGFBR2, and
PRF1. Similarly, nearly 56% (N = 39) of the
cisplatin-associated CNVs (P < 0.05) were associated with the
transcript level of genes as eQTLs. We found a target
gene, MAST4, for two cisplatin-associated CNVs
(CNVR2968.1 on chromosome 6 and CNVR7881.1 on
chromosome 20). MAST4 expression trends toward a
significant correlation (P = 0.06) with cisplatin IC50 in the
CEU samples. A target
gene (C4A at P = 8.2 x 10^-6) for a cisplatin-associated
CNV eQTL (CNVR4748.1 on chromosome 10) was
found to be significantly correlated (after multiple
testing correction [23], FDR < 0.05) with cisplatin IC50.

Restricting our analysis to biallelic CNVs, we found,
through simulations, that the top CNVs for each
platinating agent are significantly enriched for eQTLs
relative to frequency-matched SNPs (enrichment P < 0.05).
The eQTL enrichment holds at a more stringent P-value
threshold (P = 10^-6) used to define an eQTL, showing the
robustness of our observation to the definition of eQTL.
See Materials and methods for details on the simulation
procedure.
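The logic of such an enrichment simulation can be sketched as follows. This is only an illustration under stated assumptions: the actual procedure is the one given in Materials and methods, and the frequency bin width, the function names, and the toy background pool below are hypothetical.

```python
import random
from collections import defaultdict

def maf_bin(maf, width=0.05):
    """Map a minor allele frequency (0 to 0.5) to a frequency bin index."""
    return min(int(maf / width), int(0.5 / width) - 1)

def enrichment_p(cnv_mafs, n_cnv_eqtls, snps, n_sim=1000, seed=0):
    """Empirical P that frequency-matched SNP sets contain at least as
    many eQTLs as observed among the CNVs.

    cnv_mafs     minor allele frequency of each drug-associated CNV
    n_cnv_eqtls  observed number of eQTLs among those CNVs
    snps         background pool as (maf, is_eqtl) tuples
    """
    rng = random.Random(seed)
    pool = defaultdict(list)
    for maf, is_eqtl in snps:
        pool[maf_bin(maf)].append(is_eqtl)
    exceed = 0
    for _ in range(n_sim):
        # Draw one SNP per CNV from the matching frequency bin.
        count = sum(rng.choice(pool[maf_bin(m)]) for m in cnv_mafs)
        if count >= n_cnv_eqtls:
            exceed += 1
    return (exceed + 1) / (n_sim + 1)

# Hypothetical background: 100 SNPs per frequency bin, 20% of them eQTLs.
snps = [(0.02 + 0.05 * b, i % 5 == 0) for b in range(10) for i in range(100)]
# Hypothetical CNV set: 40 CNVs spread over the bins, 24 of them eQTLs.
cnv_mafs = [0.02 + 0.05 * (i % 10) for i in range(40)]

p_enriched = enrichment_p(cnv_mafs, 24, snps)
p_trivial = enrichment_p(cnv_mafs, 0, snps)
```

Matching on frequency matters because the power to detect an eQTL depends on allele frequency; an unmatched comparison could show apparent enrichment for that reason alone.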

Of the top CNVs associated with etoposide IC50 (P <
0.05), 76% (N = 86) were found to be eQTLs. Of these
CNV eQTLs, eight share UBA1 as a target gene (P < 9.7
x 10^-5). Two target genes (PLEKHG6 at P = 1.3 x 10^-6
and WSB2 at P = 8.0 x 10^-5) for etoposide-associated
CNV eQTLs (CNVR4784.1 on chromosome 10 and
CNVR1874.1 on chromosome 4, respectively) were
found to be significantly correlated with etoposide IC50
(FDR < 0.01) [23]. Nearly 52% (N = 38) of the top CNVs
associated with daunorubicin IC50 (P < 0.05) were
eQTLs. We identified two daunorubicin-associated
CNVs (CNVR479.2 and CNVR332.1 on chromosome 1)
predicting the expression of HIST1H4A (P < 6.7 x 10^-5);
we also found the expression level of this gene to be
correlated (P = 3.7 x 10^-[?]) with daunorubicin IC50 in
the CEU samples. We identified several target genes
(including PAPLN at P = 3.3 x 10^-5 and KLF12 at P =
6.1 x 10^-5) for daunorubicin-associated CNV eQTLs
(CNVR2616.1 on chromosome 5 and CNVR948.1 on
chromosome 2, respectively) whose expression levels
were significantly correlated (after multiple testing
correction [23], FDR < 0.05) with daunorubicin IC50.

As in the case of the platinating agents, we found, 
through simulations, that the top CNVs for each topoi- 
somerase II inhibitor are more likely to be eQTLs than 
frequency-matched SNPs (enrichment P < 0.05). 

Functional characterization of transcripts cis-regulated by
deletions from whole genome sequencing data

Given the observed high proportion of deletions 
among CNVs associated with cellular sensitivity to 
chemotherapeutic agents, we sought additional func- 
tional support for the role of CNVs as transcriptional 
regulators from whole genome sequencing data coming 
out of the 1000 Genomes project, which characterized 
the CNV deletions with Gencode/ENCODE transcripts 
[14]. The resulting enlarged catalog of CNVs (with an 
initial focus on deletions) included CNVs of size 50 bp 
or larger mapped at single nucleotide resolution. We 
identified 376 transcripts to which CNV deletions were
annotated [14] (by Gencode/ENCODE) as influencing
(cis-regulating) transcription and/or translation. We
proceeded to test the 376 transcripts for their role in
predicting cellular sensitivity to chemotherapeutics. At
P < 0.05, we found 21 transcript correlations with
carboplatin, 15 with cisplatin, 23 with daunorubicin, and
21 with etoposide (see Table 1). Three transcripts
(MOXD1, RCC1, SULF2) were significant after multiple
testing adjustment (adjusted P < 0.05, Bonferroni).
Remarkably, these three transcripts were the only ones
associated with all four agents at the nominal P < 0.05
threshold (Figure 1).
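The Bonferroni adjustment can be checked with a few lines of arithmetic. The sketch below assumes that the number of tests equals the 376 transcripts evaluated (the text does not state the exact multiplier used) and takes its nominal P-values from Table 1; the function name is illustrative.

```python
def bonferroni(p_values, n_tests):
    """Bonferroni-adjusted P-values: nominal P times the number of tests, capped at 1."""
    return {key: min(1.0, p * n_tests) for key, p in p_values.items()}

# Nominal P-values as listed in Table 1, keyed by (gene, drug).
nominal = {
    ("MOXD1", "carboplatin"): 0.00002,
    ("SULF2", "carboplatin"): 0.00002,
    ("RCC1", "daunorubicin"): 0.00002,
    ("SNRK", "carboplatin"): 0.00396,
}

# Assumption: one test per transcript, 376 transcripts in total.
adjusted = bonferroni(nominal, n_tests=376)
significant = sorted(gene for (gene, _), p in adjusted.items() if p < 0.05)
```

Under this assumption, a nominal P of 0.00002 gives an adjusted value of about 0.0075, which stays below 0.05, whereas a nominal P of 0.00396 does not survive adjustment.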

Table 1 Nominally significant (P < 0.05) gene expression correlations with cellular sensitivity to chemotherapeutic agents

Gene        Drug            P-value    Chromosome
AKT1S1      Carboplatin     0.04497    19
AKT1S1      Daunorubicin    0.01149    19
AKT1S1      Etoposide       0.00610    19
ALKBH8      Daunorubicin    0.01151    11
AMY1A       Etoposide       0.03933    1
AMY2A       Etoposide       0.03933    1
ANKLE1      Carboplatin     0.03017    19
ANKLE1      Daunorubicin    0.01706    19
ANKRD36B    Daunorubicin    0.02380    2
BCLAF1      Daunorubicin    0.01498    6
C13orf3     Carboplatin     0.00830    13
C13orf3     Cisplatin       0.01980    13
C13orf3     Daunorubicin    0.00572    13
C18orf1     Cisplatin       0.00398    18
CAB39L      Etoposide       0.04179    13
DIS3L2      Daunorubicin    0.03077    2
DNAJC5      Carboplatin     0.00166    20
DNAJC5      Cisplatin       0.04058    20
FBRS        Carboplatin     0.00661    16
FGFR4       Carboplatin     0.00556    5
FKBP14      Carboplatin     0.03912    7
FLI1        Carboplatin     0.04735    11
FLI1        Daunorubicin    0.00314    11
FLI1        Etoposide       0.04230    11
GALNT1      Daunorubicin    0.00065    18
GPR107      Etoposide       0.00674    9
GPR137      Carboplatin     0.00616    11
GPR137      Cisplatin       0.00828    11
GPR137      Etoposide       0.01493    11
GPR144      Etoposide       0.04526    9
GSPT1       Carboplatin     0.04002    16
GSPT1       Cisplatin       0.01429    16
HLA-DQA1    Daunorubicin    0.00946    6
HLA-DQA1    Etoposide       0.00022    6
IGLV3-21    Carboplatin     0.04800    22
MOXD1       Carboplatin     0.00002    6
MOXD1       Cisplatin       0.00002    6
MOXD1       Daunorubicin    0.03454    6
MOXD1       Etoposide       0.00007    6
MTA1        Etoposide       0.01401    14
NCOA1       Daunorubicin    0.04278    2
NEK6        Carboplatin     0.04050    9
NUB1        Etoposide       0.02671    7
PPP1R3B     Cisplatin       0.03039    8
PTP4A2      Carboplatin     0.01930    1
PTP4A2      Cisplatin       0.01182    1
PTP4A2      Etoposide       0.03348    1
RCC1        Carboplatin     0.01252    1
RCC1        Cisplatin       0.01722    1
RCC1        Daunorubicin    0.00002    1
RCC1        Etoposide       0.01358    1
RNF103      Daunorubicin    0.02655    2
RPN2        Daunorubicin    0.01206    20
SLC10A7     Daunorubicin    0.00481    4
SLC4A8      Cisplatin       0.02189    12
SLC4A8      Etoposide       0.00248    12
SMPD4       Daunorubicin    0.02446    2
SNRK        Carboplatin     0.00396    3
SNRK        Cisplatin       0.00479    3
SNRK        Daunorubicin    0.02429    3
SULF2       Carboplatin     0.00002    20
SULF2       Cisplatin       0.00053    20
SULF2       Daunorubicin    0.00014    20
SULF2       Etoposide       0.00000    20
TBC1D22A    Cisplatin       0.03807    22
TBL1XR1     Carboplatin     0.04028    3
TFDP1       Cisplatin       0.04967    13
TNK2        Carboplatin     0.01901    3
TNK2        Daunorubicin    0.00431    3
TNK2        Etoposide       0.01159    3
TYRO3       Etoposide       0.02840    15
UPK3B       Cisplatin       0.02839    7
WDR90       Daunorubicin    0.02575    16
WDR90       Etoposide       0.00111    16
ZDHHC4      Daunorubicin    0.02463    7
ZFAND2A     Daunorubicin    0.00227    7
ZFAND2A     Etoposide       0.03567    7
ZNF277      Etoposide       0.04521    7
ZNF331      Carboplatin     0.02122    19
ZNF664      Carboplatin     0.04722    12

Transcripts are cis-regulated by CNVs identified from whole-genome sequencing.

Figure 1 Three transcripts cis-regulated by CNVs identified by whole genome sequencing data predict cellular sensitivity to
functionally diverse chemotherapeutics. Through ENCODE/Gencode annotation, 376 genes were recently identified [14] to be cis-regulated
by CNV deletions. At P < 0.05, we found 21 gene expression correlations with carboplatin, 15 with cisplatin, 23 with daunorubicin, and 21 with
etoposide. Three transcripts (MOXD1 on chromosome 6, RCC1 on chromosome 1, SULF2 on chromosome 20) were significant after multiple
testing adjustment (adjusted P < 0.05) and were the only transcripts associated with all four agents at the nominal P < 0.05 threshold. The plots
show level of expression (x axis) and IC50 (y axis).

Drug susceptibility-associated CNVs are independent of
drug susceptibility-associated SNPs

We investigated to what extent the CNVs associated
with cellular sensitivity to chemotherapeutic agents may
already be interrogated by SNP-based GWAS through
linkage disequilibrium [6]. We found that the top CNV
(CNVR1616.1) associated with carboplatin IC50 (P = 5 x
10^-4) is not well-tagged by SNPs. Indeed, the best proxy
SNP for this CNV on chromosome 3 is rs967422 (at r^2
= 0.075). We found that the same CNV is also
associated with cisplatin IC50 (P = 6.5 x 10^-[?]). Another
cisplatin-associated CNV (P = 5.5 x 10^-3), CNVR7870.1, is
also not well-tagged; the best proxy SNP, rs915049, tags
the CNV at a low r^2 = 0.11. In each case, the best proxy
SNP, in contrast to the 'tagged' drug susceptibility-associated
CNV, shows no evidence of being associated with
cellular sensitivity to the drug, even at the nominal
threshold of P = 0.05.

In the case of the topoisomerase II inhibitors, of the
CNVs showing association with both etoposide and
daunorubicin (P < 0.05), we found two, CNVR7205.1 and
CNVR3293.1, that are only modestly tagged (by
rs563079 at r^2 = 0.77 and rs17166803 also at r^2 = 0.77,
respectively). Neither rs563079 nor rs17166803 is
associated with etoposide or daunorubicin IC50. In contrast,
CNVR2930.1, which is one of two etoposide-associated
CNVs predicting the expression of CCND1 (expression
P = 2.4 x 10^-7), is perfectly tagged (r^2 = 1) by
rs9500270. We identified a daunorubicin-associated
CNV (CNVR2766.1; P = 3.7 x 10^-3) for which the best
proxy SNP, rs10484327, tags the CNV at only r^2 = 0.11.
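The r^2 values quoted for CNV-SNP tagging measure how well a SNP genotype stands in for a CNV genotype. A minimal sketch follows, assuming r^2 is computed as the squared Pearson correlation between diploid copy number and SNP allele dosage across samples; the function name and the toy genotypes are hypothetical.

```python
def r_squared(cnv_copies, snp_dosage):
    """Squared Pearson correlation between CNV copy number and SNP allele dosage."""
    n = len(cnv_copies)
    mx = sum(cnv_copies) / n
    my = sum(snp_dosage) / n
    sxx = sum((c - mx) ** 2 for c in cnv_copies)
    syy = sum((s - my) ** 2 for s in snp_dosage)
    if sxx == 0 or syy == 0:
        raise ValueError("r^2 is undefined for a monomorphic variant")
    sxy = sum((c - mx) * (s - my) for c, s in zip(cnv_copies, snp_dosage))
    return sxy ** 2 / (sxx * syy)

# Hypothetical genotypes for six samples.
cnv = [2, 1, 0, 2, 1, 2]          # diploid copy number at a deletion CNV
snp_tagging = [0, 1, 2, 0, 1, 0]  # dosage mirrors the deletion exactly
snp_weak = [0, 1, 0, 1, 0, 1]     # dosage largely unrelated to the CNV

r2_tagging = r_squared(cnv, snp_tagging)
r2_weak = r_squared(cnv, snp_weak)
```

A CNV whose best proxy SNP has a low r^2 is effectively invisible to a SNP-only scan, which is why the poorly tagged CNVs described here represent variation that SNP-based GWAS would not have tested.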



PACdb: a database for cell-based pharmacogenomics 

PACdb [24] is a large-scale, publicly available genomic 
database, which to date holds the results of our SNP- 
based GWAS on the following chemotherapeutic agents: 
carboplatin, cisplatin, etoposide, daunorubicin, and 
cytarabine. PACdb implements a structured repository 
for incorporating other datasets, including information 
on other drugs, gene expression profiling, and cellular 
phenotypes. GWAS were initially conducted using SNP
genotype data made available by the International
HapMap project. We expanded PACdb to include the
results of our CNV-based GWAS on carboplatin,
cisplatin, etoposide, and daunorubicin. Furthermore, the
results of eQTL mapping of HapMap CNVs to
transcriptional expression are made available in the eQTL
repository SCAN. Figure 2 shows a schematic diagram
of our approach to the discovery of CNVs associated
with sensitivity to drug and to the identification of such
CNVs that act as eQTLs; it also illustrates the genomic
resources we have made publicly available to the
scientific community.

Figure 2 A schematic of the approach to the discovery of drug susceptibility-associated CNVs. We added to PACdb the CNV associations
with cellular sensitivity to the chemotherapeutic agents. Note that some drug-associated CNVs (dCNVs) are poorly interrogated by SNPs and are
thus independent of drug-associated SNPs (dSNPs). SCAN includes the results of eQTL mapping of CNVs to transcriptional expression.

CNVs and drug classes

We evaluated to what extent the top CNV associations
for a given drug would overlap with the top CNV
associations for another drug belonging to the same
chemotherapeutic drug class, defined in terms of
mechanism of action. At the suggestive threshold of P <
0.05, of the CNVs showing association with carboplatin
IC50, 16% (n = 11) were also associated with cisplatin
IC50. Thus, we see a significant overlap (P = 7.7 x 10^-[?])
between the two sets of CNVs associated with cellular
sensitivity to the platinating agents. Figure 3 illustrates a
duplication (CNVR7826_full on chromosome 20) that is
associated with sensitivity to carboplatin (Figure 3a) and
to cisplatin (Figure 3b); note that the observed genotype
associations with the platinums have concordant
direction. Furthermore, the CNV is an eQTL predicting the
expression of GSR (P = 4.67 x 10^-5) and SPARC (P =
4.70 x 10^-5). Remarkably, the expression levels of these
target mRNAs, GSR (P = 0.045) and SPARC (P = 0.004),
are correlated with carboplatin IC50; similarly, GSR (P =
0.005) and SPARC (P = 0.005) are correlated with
cisplatin IC50. Glutathione reductase (GSR) has been
implicated in several studies of platinum sensitivity [25,26].

In the case of the topoisomerase II inhibitors, 12% of
the etoposide-associated CNVs were found to associate
with daunorubicin IC50, and the observed overlap is still
highly significant (P = 2.7 x 10^-10). The slightly greater
percentage of overlap for the platinating agents is not
due to higher phenotypic correlation (r = 0.52 for the
platinating agents versus r = 0.69 for the topoisomerase II
inhibitors).
Real-time PCR validation 

We sought additional experimental support for the
genes targeted by multiple CNVs associated with drug
susceptibility. We identified two etoposide-associated
CNV eQTLs that share CCND1 as a target gene
(expression P = 2.4 x 10^-7). The over-expression of
CCND1 has been shown to be associated with the
up-regulation of the GST-π gene, increasing the sensitivity
of a cancer cell line to etoposide [27]. We found
CCND1 expression to be significantly correlated with
etoposide IC50 (P = 7.8 x 10^-[?]) in the CEU samples.
After multiple testing correction, the gene remained
significant (q-value = 0.0027). We subsequently
conducted functional validation of the role of CCND1
expression in altering sensitivity to etoposide by
performing real-time quantitative PCR assays in an
independent set of 52 CEPH LCLs (Figure 4; see Table S1
in Additional file 1 for the real-time PCR data on
CCND1). Consistent with the direction of effect in the
CEU samples, increased CCND1 mRNA levels were associated
with increased IC50 (P = 0.05) in the validation set. Thus,
increasing CCND1 expression confers resistance to
etoposide.

Discussion 

Understanding in a comprehensive manner the genetic
risk factors contributing to variation in drug response is
a crucial component of the realization of personalized
medicine. The drugs evaluated in our study are widely
used in the treatment of many cancer types, including
ovarian, colorectal, testicular, and lung; all are associated
with particular toxicities and resistance. Although SNPs
have long been used in association studies to elucidate
the effect of genetic polymorphisms on drug response,
CNVs have been relatively understudied. Recent
genome-wide surveys of CNVs have now established
that these structural variants are a common phenomenon
in the human genome [5]. With rapid advances in
methods that facilitate their assay and analysis, variation
in copy number for genes encoding drug metabolizing
enzymes has been increasingly implicated in dramatic
differences in responsiveness to drugs. Such
CNVs have been observed to alter gene dosage and are
thus likely to play an important role in determining
drug efficacy or toxicity.

Figure 3 A CNV associated with cellular sensitivity to both carboplatin and cisplatin. We identified an amplification CNV, CNVR7826_full on
chromosome 20, that is associated with both carboplatin sensitivity (a) (P = 0.0056) and cisplatin sensitivity (b) (P = 0.0023). The x-axis is the
copy number (3 or 4) for the CNV; the y-axis is the log2-transformed IC50.



In this study, we set out to utilize recent developments
in the assay of CNVs from population-scale
projects, including an extensive comparative genomic
hybridization-based catalog of CNVs [4] and a map of
structural variants based on whole genome DNA
sequencing data (the 1000 Genomes Project) [14], in
order to evaluate the role of CNVs in cellular sensitivity
to chemotherapeutic agents. The cell lines for the
samples express a sizable part of the genome [28], thus
enabling the investigation of genes represented in
biologically relevant pathways. While the cancer genome
is clearly necessary for understanding chemotherapeutic
response, the importance of germline genetic variation
in drug sensitivity has also been consistently
demonstrated [15].

Figure 4 The relationship between CCND1 expression and etoposide IC50 in the discovery set and in an independent replication set of
52 CEPH LCLs. (a) Discovery set; (b) independent replication set of 52 CEPH LCLs. Panel (b) describes the expression-IC50 relationship using
expression measured by real-time PCR; the relationship is consistent with that in the discovery set of CEU samples.

For each drug included in our study, we found that 
the top associated CNVs are more likely to act as 
eQTLs and predict transcript levels than minor allele 
frequency (MAF)-matched SNPs. The overlap of the 
drug susceptibility-associated CNVs with expression- 
associated CNVs (eCNVs) is greater than is expected, 
based on simulation studies. Consistent with a previous 
report [6], CNVs associated with cellular sensitivity to 
drug treatment are not likely to overlap exons, suggest- 
ing that they act not to disrupt coding sequence but to 
regulate gene expression. The high proportion of eQTLs
among the CNVs associated with cellular sensitivity to
each of the drugs further supports the hypothesis that
these CNVs mediate their phenotypic consequences
through their effect on the transcriptome. Genome-wide
studies of pharmacologic phenotypes, such as response
to antineoplastic agents, may benefit from studies of
CNVs as eQTLs.

This study, to our knowledge, is the first comprehen- 
sive genome-wide study of the effect of CNVs, from the 
most extensive array-based and sequencing-based sur- 
veys of these structural variants, on pharmacologic phe- 
notypes. In contrast to a recent disease susceptibility 
study that concluded that most CNVs that are well- 
typed have been indirectly explored by SNP studies [6], 
we found a number of CNVs associated with drug sensi- 
tivity that are independent of SNPs. These CNVs there- 
fore constitute novel genetic variations that have not 
been previously interrogated by SNP-based GWAS of 
pharmacologic phenotypes. Our discovery of drug
susceptibility-associated variations, in the form of CNVs,
that are independent of previous SNP findings and that
show evidence for altering gene expression as eQTLs, 
suggests that CNVs should be included in comprehen- 
sive pharmacogenomic studies. 

Candidate pharmacogenetic studies on drug metabo- 
lism-related genes, namely CYP2D6, CYP2A6, SULT1A1 
and GSTM1, have documented the effect of CNVs on 
gene activity. Our results strongly support the necessity 
of integrating both SNP and CNV data to tighten the 
genotype-phenotype gap in pharmacogenetic studies. 
While the functional validation we conducted in this 
study may not allow robust predictions, the functional 
characterization of the effect of CCND1 mRNA level on 
cellular sensitivity to etoposide underscores the impor- 
tance of considering the role of the transcripts that are 
the targets of drug susceptibility-associated CNVs (act- 
ing as eQTLs) in conferring drug susceptibility. 

We found a significant overlap (P = 7.7 x 10^-[?])
between the CNVs associated with cisplatin and
carboplatin. Platinating agents share a similar mechanism of
therapeutic action and interact with DNA to form
interstrand and intrastrand cross-links, leading to cytotoxic
DNA lesions and eventually to apoptotic cell
death. Our findings strongly support the hypothesis that
CNV-based mechanisms play a crucial role in
determining platinum sensitivity. In particular, we identified a
duplication that is associated with cellular sensitivity to
both carboplatin and cisplatin. Furthermore, the CNV
predicts the expression of glutathione reductase (GSR), a
gene that has been the subject of several studies on
cisplatin sensitivity [26,30]. The glutathione pathway is
involved in the metabolism of platinum compounds,
which are subject to inactivation by glutathione
conjugation [27].

A significant level of overlap is also observed with the 
topoisomerase II inhibitors. Daunorubicin is a DNA 
intercalator that indirectly interacts with topo II while 
etoposide binds directly to the enzyme. We identified 14 
CNVs associated with both etoposide and daunorubicin 
at P < 0.05. The extent of overlap between the platinat- 
ing agents (as well as between the topoisomerase II inhi- 
bitors) is significantly higher than the level of overlap 
across drug classes (7%). 

There is a general caveat to our findings concerning 
the set of CNVs included in this analysis. The CNVs 
tested for association with cellular sensitivity to drugs 
may be biased towards genotypeable variants; conse- 
quently, many highly complex regions may have been 
excluded. Furthermore, our study makes no assertions 
about low frequency variants. Nevertheless, our findings 
represent the most comprehensive study of the effect of 
common CNVs, from the most extensive map of these 
variants available, on chemotherapeutic susceptibility to 
a wide array of drugs. 



Finally, we provide the results of our genome-wide 
study of CNVs and sensitivity to chemotherapeutic 
agents in a publicly available online database, PACdb. 
Analysis results on our cell-based model are easy to 
query, which should allow investigators to utilize the 
resource as a discovery platform or as a validation tool 
for clinical observations. 

Conclusions 

Our study identified CNVs that predict cellular sensitiv- 
ity to an array of chemotherapeutic agents of heteroge- 
neous molecular therapeutic action. Importantly, several 
of the most significant CNV-drug associations are inde- 
pendent of SNPs; thus, these CNVs provide genetic var- 
iations that have not been previously explored by SNP- 
based GWAS of pharmacologic phenotypes. Further- 
more, our findings show that pharmacogenomic studies 
may be greatly enhanced by studies of CNVs as eQTLs. 
Target genes of CNVs, especially those associated with 
multiple independent CNVs associated with drug 
response, provide robust gene expression signatures of 
chemotherapeutic susceptibility. 

Materials and methods 

In vitro cellular sensitivity to chemotherapeutic agents 

We obtained unrelated HapMap phase II CEU (Utah 
residents with ancestry from northern and western Eur- 
ope) samples from Coriell Institute for Medical Research 
(Camden, NJ, USA). Cell lines were maintained in RPMI 
1640 media supplemented with 15% fetal bovine serum 
(Hyclone, Logan, UT, USA) and 1% L-glutamine. The
cell lines were passaged three times per week at a concentration
of 350,000 cells/ml at 37°C in a 95% humidified,
5% CO2 atmosphere. Cellular sensitivity to drugs
was measured in these cell lines with increasing concen- 
trations of drug (from carboplatin, cisplatin, daunorubi- 
cin, and etoposide). Cell growth inhibition was evaluated 
using the alamarBlue™ assay (BioSource International 
Inc., Camarillo, CA, USA), as previously described [21]. 
IC50 (the concentration required to inhibit 50% of cell 
growth) was determined by curve fitting of percent cell 
survival against concentrations of the drug. 
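
The curve-fitting step can be illustrated with a minimal sketch. The text does not specify the curve model, so the two-parameter Hill (log-logistic) form, the linearization by a logit transform, and the function names `fit_line` and `estimate_ic50` are all assumptions made for illustration, not the fitting procedure used in the study.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_ic50(concentrations, percent_survival):
    """Estimate IC50 from a linearized two-parameter Hill curve.

    Assumed model (not stated in the paper):
        survival = 100 / (1 + (c / IC50) ** h)
    which gives ln(s / (100 - s)) = -h * ln(c) + h * ln(IC50).
    """
    xs, ys = [], []
    for c, s in zip(concentrations, percent_survival):
        # The logit is undefined at 0% and 100% survival, so skip those points.
        if c > 0 and 0 < s < 100:
            xs.append(math.log(c))
            ys.append(math.log(s / (100.0 - s)))
    if len(xs) < 2:
        raise ValueError("need at least two usable dose points")
    slope, intercept = fit_line(xs, ys)
    if slope >= 0:
        raise ValueError("survival does not decrease with dose")
    # slope = -h and intercept = h * ln(IC50), so ln(IC50) = -intercept / slope.
    return math.exp(-intercept / slope)
```

A call such as `estimate_ic50([0.02, 0.1, 0.5, 2.5], survival_values)` returns the concentration at which the fitted curve crosses 50% survival, in the same units as the input concentrations.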

A catalog of CNVs 

Recent population-based surveys have mapped thou- 
sands of CNVs with increasing resolution. Such surveys 
have opened up approaches for modeling the relation- 
ship between structural variation and complex traits. 
Efforts to catalog these CNVs are necessary to clarify 
the functional impact of these variants. Here we utilize 
the recent comprehensive survey of CNVs [4] larger 
than 1 kb in size in the human genome, assayed in 
human LCLs from CEU (Utah residents with ancestry 
from northern and western Europe) samples. To 
investigate further the effect of deletions and to confirm
our findings on the role of drug-associated CNVs as
eQTLs, we studied the relationship between cis-regulated
transcripts (from Gencode/ENCODE functional
annotation) and cellular sensitivity to chemotherapeutics 
from a recent comprehensive study based on whole gen- 
ome sequencing data of the 1000 Genomes Project [14], 
which mapped CNVs of 50 bp or larger in size at 
nucleotide resolution. 

Association analysis of CNVs or transcript levels with 
cellular sensitivity to drugs 

For each CNV, genotypes were tested for association 
with cellular sensitivity to each of the drugs separately. 
Linear regression was performed between the copy 
number genotype (as the independent variable) and 
log2-transformed IC50 (as the dependent variable). Analysis
was done in the statistical computing and graphics
software R; the lm function was used to fit linear
models. 
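
As an illustration of the per-CNV test, the following sketch regresses log2-transformed IC50 on copy number genotype. The analysis itself was run with R's lm function; this Python version and its function name `cnv_association` are assumptions for illustration. It returns the slope, intercept, and t-statistic of the slope; a P-value would additionally require the t distribution with n - 2 degrees of freedom, which is omitted here.

```python
import math


def cnv_association(copy_numbers, ic50_values):
    """Simple linear regression of log2(IC50) on copy number genotype.

    Returns (slope, intercept, t_statistic) for the copy number effect.
    """
    x = [float(c) for c in copy_numbers]
    y = [math.log2(v) for v in ic50_values]
    n = len(x)
    if n < 3:
        raise ValueError("need at least three samples")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("copy number does not vary across samples")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual variance with n - 2 degrees of freedom.
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    std_err = math.sqrt(rss / (n - 2) / sxx)
    t_stat = slope / std_err if std_err > 0 else math.copysign(math.inf, slope)
    return slope, intercept, t_stat
```

In this sketch a positive slope means that each additional copy is associated with a higher IC50, that is, with lower cellular sensitivity to the drug.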

Similarly, to examine the relationship between tran- 
script level and drug-induced cellular sensitivity, a linear 
model was constructed for each drug, as previously 
described [19], between log2-transformed gene expression
and log2-transformed IC50. Generally, for multiple
testing adjustment, FDRs were calculated using the qvalue
[23] package in R. Unless otherwise stated, an
FDR cutoff <0.05 was used as the statistical significance 
threshold. 
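
To make the FDR thresholding concrete, the sketch below applies the Benjamini-Hochberg adjustment to a list of P-values. This is a deliberately swapped-in technique: the study used q-values from the R package cited above, which are estimated differently and need not match Benjamini-Hochberg adjusted values. The function name is hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

Under this scheme, associations whose adjusted value falls below 0.05 would be retained, mirroring the FDR cutoff stated above.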

Mapping CNVs as expression quantitative trait loci 

SCAN [29] is an online database that makes publicly 
available the results of our eQTL studies, initially on 
single base polymorphisms. Global mRNA expression 
was assayed using the Affymetrix GeneChip Human 
Exon 1.0 ST Array [30]. To map CNVs as genomic loci 
influencing the transcriptome, we conducted linear 
regression on over 13,000 transcript clusters and the set 
of CNVs identified in the HapMap LCLs [31]. 

Simulation studies 

We performed simulations to evaluate enrichment for 
eQTLs among the CNVs associated with cellular sensitiv- 
ity to the drugs included in our study. To empirically 
generate the null distribution, we randomly generated 
sets of SNPs matching the minor allele frequency (MAF) of the
original list of CNVs (see Figure S1 in Additional file 2
for MAF distribution of the biallelic CNVs included in 
our study), as previously described [32]. To enable us to 
perform simulations conditional on MAF, we constructed 
non-overlapping MAF bins, each of width 0.05, using the 
MAFs of the SNPs in the HapMap CEU samples. The 
null sets were drawn from the combined platform SNPs 
(Affymetrix 6.0 and Illumina 1M) as well as from the
entire set of HapMap CEU SNPs. The observed count is
then compared to the empirically generated distribution 
to get an empirical P-value for the enrichment. 
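
The MAF-conditional simulation can be sketched as follows. Several details are assumptions made for illustration rather than the authors' exact procedure: SNPs are sampled with replacement within each bin, the SNP pool is represented as (MAF, is_eQTL) pairs, the empirical P-value uses the common add-one correction, and all function names are hypothetical.

```python
import random


def maf_bin(maf, width=0.05):
    """Index of the non-overlapping MAF bin (0.5 falls in the last bin)."""
    n_bins = round(0.5 / width)
    # The small epsilon guards against floating-point error at bin edges.
    return min(int(maf / width + 1e-9), n_bins - 1)


def empirical_p_value(observed_count, cnv_mafs, snp_pool, n_sim=1000, seed=0):
    """Empirical P-value for eQTL enrichment among drug-associated CNVs.

    snp_pool is a list of (maf, is_eqtl) pairs. Each simulation draws one
    SNP per CNV from the CNV's MAF bin and counts how many are eQTLs.
    A KeyError is raised if a CNV falls in a bin with no SNPs.
    """
    rng = random.Random(seed)
    by_bin = {}
    for maf, is_eqtl in snp_pool:
        by_bin.setdefault(maf_bin(maf), []).append(bool(is_eqtl))
    hits = 0
    for _ in range(n_sim):
        null_count = sum(rng.choice(by_bin[maf_bin(m)]) for m in cnv_mafs)
        if null_count >= observed_count:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_sim + 1)
```

Here `observed_count` is the number of drug-associated CNVs that are also eQTLs, and the returned value is the fraction of MAF-matched null sets whose eQTL count is at least as large.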

Functional validation 

We obtained 52 unrelated non-HapMap CEPH (Centre 
d'Etude du Polymorphisme Humain) samples (indepen- 
dent of the discovery cohort consisting of HapMap CEU 
samples) from Coriell Institute for Medical Research. 
Cellular sensitivity to etoposide was quantified
as described above with increasing concentrations
of etoposide (0.02 μM, 0.1 μM, 0.5 μM, and
2.5 μM, for 72 hours). IC50 was determined for each cell
line. CCND1 mRNA levels were evaluated using a real- 
time quantitative PCR assay in the samples using Taq- 
Man Gene Expression Assays (Applied Biosystems, Fos- 
ter City, CA, USA) on the Applied Biosystems 7500 
real-time PCR system. Primer/probes were obtained 
from Applied Biosystems. Human B2M (huB2M,
beta-2 microglobulin; NM_004048; Applied Biosystems
catalog number 4326319E) was used as the endogenous
control. Relative quantification of gene expression
utilized the 2^(-ΔΔCt) method [33].
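
For reference, the arithmetic of the 2^(-ΔΔCt) method can be written as a short function. The choice of calibrator sample is not stated in the text, so the calibrator arguments, like the function and parameter names, are assumptions for illustration.

```python
def relative_expression(ct_target, ct_control, ct_target_cal, ct_control_cal):
    """Relative expression by the 2^(-ddCt) method.

    ct_target, ct_control: Ct of the target gene and the endogenous
        control in the sample of interest.
    ct_target_cal, ct_control_cal: the same two Ct values in a
        calibrator sample (hypothetical here).
    """
    delta_ct_sample = ct_target - ct_control
    delta_ct_calibrator = ct_target_cal - ct_control_cal
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)
```

A sample whose normalized Ct is two cycles lower than the calibrator's gives a fourfold higher relative expression, since each cycle corresponds to a doubling of template under the method's assumption of equal amplification efficiency.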

Additional material 



Additional file 1: Real-time PCR data on CCND1. The table lists the
real-time PCR values for CCND1 expression, as measured in the
independent set of 52 cell lines. 

Additional file 2: The minor allele frequency distribution of the 
biallelic CNVs included in our study. The plot is a histogram of the 
minor allele frequency of the biallelic CNVs that were evaluated in our 
study. 